AITopics | section 3

Collaborating Authors

section 3 Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

From Spectral Methods to Sample Complexity Bounds for Fourier Neural Operators

Chandramoorthy, Nisha, Sanz-Alonso, Daniel, Waniorek, Nathan

arXiv.org Machine LearningJul-2-2026

We establish approximation and learning guarantees for Fourier neural operators (FNOs) applied to time-$T$ solution operators of dissipative evolution equations. The analysis builds on the premise that FNOs can efficiently approximate and learn solution operators whenever these operators admit stable and accurate spectral discretizations. To formalize this idea, we introduce classes of evolution operators defined through spectral methods and derive FNO approximation bounds and polynomial sample complexity guarantees for these classes. For equations with polynomial nonlinearities, the learning rates depend primarily on the smoothness of the input space and the dimension of the physical domain. Our results hold uniformly over broad families of dissipative equations, rather than for a single fixed PDE, and apply in particular to the Navier--Stokes, Allen--Cahn, and Cahn--Hilliard equations. For equations with non-polynomial smooth nonlinearities, we prove that polynomial sample complexity still holds with rates that now additionally depend on the smoothness of the nonlinear terms and the dissipation strength. Overall, we connect classical spectral approximation theory with modern operator learning and explain when FNOs can learn nonlinear evolution operators efficiently.

artificial intelligence, machine learning, operator, (18 more...)

arXiv.org Machine Learning

2607.0032

Country: North America > United States > Illinois > Cook County > Chicago (0.40)

Genre: Research Report (0.64)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Adaptive Iterative Hard Thresholding for Online High-dimensional Quantile Regression

Zhou, Zitian, Lin, Nan

arXiv.org Machine LearningJun-30-2026

Online high-dimensional regression requires algorithms that can update sequentially while preserving structural sparsity. We propose \textit{Adaptive Iterative Hard Thresholding (AIHT)}, an online sparse-regression framework that alternates stochastic subgradient updates with adaptively scheduled hard-thresholding steps. The key idea is to separate support discovery from local refinement: early in the learning process, AIHT delays thresholding so that weak but informative coordinates have time to accumulate signal, while later it increases the projection frequency to stabilize the sparse estimator and exploit local curvature. We develop the theory for high-dimensional online quantile regression, a challenging setting in which the loss is nonsmooth and the data may exhibit heterogeneity or heavy-tailed noise. Under restricted curvature and gradient-leakage conditions, AIHT remains in an inflated sparse cone, exhibits a two-phase convergence behavior, and attains logarithmic regret for the sliding-window objective. Simulations for online quantile regression, together with threshold-scheduling ablations, support the proposed mechanism and illustrate its advantage over standard online sparse-learning baselines.

artificial intelligence, machine learning, projection, (18 more...)

arXiv.org Machine Learning

2606.28652

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Testing hypotheses via orthogonalization

Dharamshi, Ameer, Zou, Runjia, Witten, Daniela

arXiv.org Machine LearningJun-30-2026

Classical hypothesis testing frameworks break down in contemporary settings in which null hypotheses are increasingly abstract, the same data are used to both generate and test hypotheses, and minimal assumptions about the underlying data are made. In this work, we propose a new framework for conducting valid hypothesis tests in broad contexts. We propose to add and subtract external noise generated from a symmetric shift-family to our data, $X$, to partition it into two pieces, $X^{(1)}$ and $X^{(2)}$. We provide a generic strategy for orthogonalizing $X^{(2)}$ against $X^{(1)}$ under the null hypothesis $H_0$, then show that testing whether the orthogonalization was successful provides a valid test of $H_0$ under mild assumptions. Remarkably, this framework extends naturally to the post-selection inference setting: we simply select a hypothesis on $X^{(1)}$, then perform orthogonalization under the selected null. As our approach neither requires pre-specification of the selection mechanism, nor is restricted to a small class of data-generating distributions, it dramatically expands the settings for which valid post-selection inference can be conducted. We showcase the flexibility of our proposal in several case studies involving challenging pre-specified null hypotheses and post-selection inference scenarios.

artificial intelligence, hypothesis, machine learning, (18 more...)

arXiv.org Machine Learning

2606.29732

Country: North America > United States (0.67)

Genre:

Workflow (1.00)
Research Report (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

Decision-Aligned Evaluation of Uncertainty Quantification

Schneider, Annika, Rochussen, Tommy, Stiller, Joshua, Fortuin, Vincent

arXiv.org Machine LearningJun-26-2026

Uncertainty estimates in machine learning are typically evaluated using generic metrics such as the negative log-likelihood and expected calibration error, yet good performance on such metrics does not necessarily imply high utility in downstream decisions. We introduce decision-alignment, a criterion that reveals which evaluation metrics meaningfully align with downstream utilities. Applying this framework, we show that many widely used uncertainty metrics are either misaligned with common decision problems or encode pathological prior beliefs about the downstream task. We then propose prior-weighted utility metrics, a special class of proper scoring rules that provides decision-aligned uncertainty evaluation. Across benchmark experiments and real-world case studies, our metrics consistently align with realized decision utility, while conventional metrics do not. Our results surface flaws in the current UQ evaluation protocol and offer a principled extension of existing metrics toward decision-relevant UQ evaluation.

alignment, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

2606.2699

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Energy > Power Industry (0.93)
Banking & Finance > Loans (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(2 more...)

Add feedback

When are likely answers right? On Sequence Probability and Correctness in LLMs

Zenn, Johannes, Geiping, Jonas

arXiv.org Machine LearningJun-26-2026

Many decoding methods for large language models can be understood as shifting probability mass toward outputs that are more likely under the model, either locally at the token level or globally at the sequence level. Therefore, their success depends on a fundamental question: when does sequence probability, that is, the conditional probability of a continuation given a prompt, actually align with correctness? In this paper, we set out to quantify this relationship across decoding methods, models, and benchmarks at four levels: across decoding methods, across hyperparameters within a method, across prompt-answer pairs within a dataset, and across repeated responses to the same prompt. We find that higher sequence probability is often predictive of correctness across prompt-answer pairs within a fixed dataset. However, this relationship does not generally transfer to decoding decisions: increasing sequence probability by changing hyperparameters or methods does not reliably improve accuracy. Further, sequence probability is not a good indicator of correctness for responses to the same prompt. These findings clarify when decoding can and cannot be expected to improve correctness, and provide practical guidance for decoding, self-consistency, and verifier-free self-improvement.

correlation, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2606.27359

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Improving Deep Learning for Accelerated MRI With Data Filtering

Reinhard Heckel

Neural Information Processing SystemsJun-23-2026, 03:43:48 GMT

Deep neural networks achieve state-of-the-art results for accelerated MRI reconstruction. Most research on deep learning based imaging focuses on improving neural network architectures trained and evaluated on fixed and homogeneous training and evaluation data. In this work, we investigate data curation strategies for improving MRI reconstruction. We assemble a large dataset of raw k-space data from 18 public sources consisting of 1.1M images and construct a diverse evaluation set comprising 48 test sets, capturing variations in anatomy, contrast, number of coils, and other key factors. We propose and study different data filtering strategies to enhance performance of current state-of-the-art neural networks for accelerated MRI reconstruction. Our experiments show that filtering the training data leads to consistent, albeit modest, performance gains. These performance gains are robust across different training set sizes and accelerations, and we find that filtering is particularly beneficial when the proportion of in-distribution data in the unfiltered training set is low.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Spatial-Aware Decision-Making with Ring Attractors in Reinforcement Learning Systems

Neural Information Processing SystemsJun-23-2026, 02:17:19 GMT

Ring attractors, mathematical models inspired by neural circuit dynamics, provide a biologically plausible mechanism to improve learning speed and accuracy in Reinforcement Learning (RL). Serving as specialized brain-inspired structures that encode spatial information and uncertainty, ring attractors explicitly encode the action space, facilitate the organization of neural activity, and enable the distribution of spatial representations across the neural network in the context of Deep Reinforcement Learning (DRL). These structures also provide temporal filtering that stabilizes action selection during exploration, for example, by preserving the continuity between rotation angles in robotic control or adjacency between tactical moves in game-like environments. The application of ring attractors in the action selection process involves mapping actions to specific locations on the ring and decoding the selected action based on neural activity. We investigate the application of ring attractors by both building an exogenous model and integrating them as part of DRL agents. Our approach significantly improves state-of-the-art performance on the Atari 100k benchmark, achieving a 53% increase in performance over selected baselines.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Federated Multi-armed Bandits with Efficient Bit-Level Communications

Neural Information Processing SystemsJun-23-2026, 00:15:38 GMT

In this work, we study the federated multi-armed bandit (FMAB) problem, where a set of agents collaboratively aim to minimize cumulative regret. Unlike traditional centralized bandit models, agents in FMAB settings are connected via a communication graph and cannot share data freely due to bandwidth limitations or privacy constraints. This raises a fundamental challenge: how to achieve optimal learning performance under stringent communication budgets. We propose a novel communication-efficient algorithm containing two points: one for eliminating suboptimal arms through early and frequent communication of key decisions, and the other for refining global estimates using incremental epoch, quantized, and differentially transmitted statistics. Incremental Epoch-based Successive Elimination Algorithm (EpoInc-SE) is presented by carefully balancing communication frequency and precision of global estimates. Theoretically, we derive tight upper bounds on both individual cumulative regret and group regret, and prove that our method asymptotically matches the lower bound of regret in federated settings.

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology (0.67)
Education (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.48)

Add feedback

Kernel of Partition Paths: A Unified Representation for Tree Ensembles

Mahler, Nicolas

arXiv.org Machine LearningJun-23-2026

A recent line of work has reframed individual decision trees as linear models on engineered features associated with their splits, opening routes for oracle inequalities and featureimportance reinterpretation, but leaving open the question of what unified geometric object a forest induces when one indexes its feature map by nodes rather than by splits. The present paper studies that object. KPP indexes the feature map by the nodes of the forest, weighted by a path metric that turns each coordinate into a component of a squared-Euclidean pathisometric embedding. KPP unifies four pillars under a single node-indexed representation whose Gram is non-diagonal and carries a metric: prediction, exact additive attribution, deterministic Lipschitz robust radius in the KPP metric, and uniform Rademacher risk bounds for regression and classification under fixed, honest, or cross-fit conditioning. All probabilistic guarantees are conditional on the representation and are stated under three explicit conditioning regimes; the robust-radius guarantee is deterministic in the KPP metric rather than in a norm on the raw input. Conjectured fast-rate refinements for both regression and classification are stated as open problems and are not claimed as theorems.

artificial intelligence, machine learning, representation, (19 more...)

arXiv.org Machine Learning

2606.18853

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

SAGE: A Novelty Gate for Efficient Memory Evolution in Agentic LLMs

Wang, Sijia, Brahma, Dhanajit, Henao, Ricardo

arXiv.org Machine LearningJun-23-2026

Agentic LLMs must continuously decide whether newly extracted facts should be added, merged with existing memories, or ignored, yet prior work has focused more on retrieval and storage than on principled write-side control. We frame memory evolution as a novelty-detection problem and propose SAGE, a Spherical Adaptive Gate for memory Evolution that scores candidate facts with a von Mises-Fisher-based density estimator over memory embeddings and routes them with an adaptive threshold that tracks memory-store geometry. SAGE resolves clearly novel facts as ADD, clearly redundant facts as NOOP, and sends only uncertain cases to an LLM merge step, reducing expensive write-time reasoning. On LoCoMo, SAGE achieves the best average token-F1 against Mem0 on all seven open-weight backbone comparisons, while on GPT-4o-mini it reduces add-phase API cost by 3.4$\times$ and add-phase latency by 2.5$\times$ with only a small average judge-score gap. As a drop-in binary gate for A-Mem, SAGE skips roughly 16-18% of LLM calls across five models with minimal quality change on open-weight backbones. These results suggest that novelty-aware write control is a practical lever for improving both memory quality and system efficiency in long-term agentic memory. The source code for our approach is accessible at https://github.com/swang1024/SAGE.

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2605.30711

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback